11 research outputs found

    teaMPI—replication-based resiliency without the (performance) pain

    In an era where we cannot afford to checkpoint frequently, replication is a generic way forward to construct numerical simulations that can continue to run even if hardware parts fail. Yet replication is often not employed at larger scales, as naïvely mirroring a computation once effectively halves the machine size, and as keeping replicated simulations consistent with each other is not trivial. We demonstrate for the ExaHyPE engine—a task-based solver for hyperbolic equation systems—that it is possible to realise resiliency without major code changes on the user side, while we introduce a novel algorithmic idea where replication reduces the time-to-solution: the redundant CPU cycles are not burned “for nothing”. Our work employs a weakly consistent data model in which replicas run independently yet inform each other through heartbeat messages that they are still up and running. Our key performance idea is to let the tasks of the replicated simulations share some of their outcomes, while we shuffle the actual task execution order per replica. This way, replicated ranks can skip some local computations and automatically start to synchronise with each other. Our experiments with a production-level seismic wave-equation solver provide evidence that this novel concept has the potential to make replication affordable for large-scale simulations in high-performance computing.
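    The task-sharing idea from the abstract can be illustrated with a toy sketch. This is not teaMPI's actual API; the function and variable names are invented for illustration. Two replica teams walk the same task set in different shuffled orders and publish finished outcomes, so each team can skip any task the other has already delivered.

```python
import random

def run_replicas(task_ids, compute, seed_a=1, seed_b=2):
    """Toy sketch (not the real teaMPI interface) of outcome sharing
    between two replica teams with shuffled task execution orders."""
    orders = []
    for seed in (seed_a, seed_b):
        order = list(task_ids)
        random.Random(seed).shuffle(order)   # per-replica execution order
        orders.append(order)
    shared = {}                              # outcomes exchanged between teams
    computed = [0, 0]                        # tasks actually computed per team
    for step in range(len(orders[0])):
        for team in (0, 1):
            tid = orders[team][step]
            if tid in shared:                # other team already delivered it
                continue                     # -> skip the local computation
            shared[tid] = compute(tid)
            computed[team] += 1
    return shared, computed
```

    Because the orders differ, each task is computed by whichever team reaches it first, so the total work across both teams equals one copy of the simulation rather than two.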

    ExaHyPE: An engine for parallel dynamically adaptive simulations of wave problems

    ExaHyPE (“An Exascale Hyperbolic PDE Engine”) is a software engine for solving systems of first-order hyperbolic partial differential equations (PDEs). Hyperbolic PDEs are typically derived from the conservation laws of physics and are useful in a wide range of application areas. Applications powered by ExaHyPE can be run on a student’s laptop, but are also able to exploit thousands of processor cores on state-of-the-art supercomputers. The engine can dynamically increase the accuracy of the simulation using adaptive mesh refinement where required. Due to the robustness and shock-capturing abilities of ExaHyPE’s numerical methods, users of the engine can simulate linear and non-linear hyperbolic PDEs with very high accuracy. Users tailor the engine to their particular PDE by specifying the evolved quantities, fluxes, and source terms. A complete simulation code for a new hyperbolic PDE can often be realised within a few hours — a task that traditionally takes weeks, months, or even years for researchers starting from scratch. In this paper, we showcase ExaHyPE’s workflow and capabilities through real-world scenarios from our two main application areas: seismology and astrophysics.
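    The “user supplies the flux” workflow can be sketched with a minimal 1-D finite-volume solver for u_t + f(u)_x = 0. This is not ExaHyPE's actual interface (which also provides ADER-DG discretisation, AMR and parallelism); it only illustrates how a solver engine can be tailored through user-specified ingredients. It uses the Rusanov (local Lax-Friedrichs) numerical flux with periodic boundaries.

```python
import math

def solve_fv(u0, flux, max_speed, dx, dt, steps):
    """Minimal 1-D finite-volume sketch for u_t + f(u)_x = 0, with the
    flux and maximal signal speed supplied by the user (illustrative
    only; NOT ExaHyPE's real API)."""
    u = list(u0)
    n = len(u)
    for _ in range(steps):
        # Rusanov flux at each interface i+1/2 (periodic wrap-around).
        f = [0.5 * (flux(u[i]) + flux(u[(i + 1) % n]))
             - 0.5 * max_speed * (u[(i + 1) % n] - u[i]) for i in range(n)]
        # Conservative update: u_i -= dt/dx * (F_{i+1/2} - F_{i-1/2}).
        u = [u[i] - dt / dx * (f[i] - f[i - 1]) for i in range(n)]
    return u

# Linear advection u_t + a u_x = 0: the "user code" is just the flux
# a*q and the signal speed a.
a = 1.0
n, dx = 64, 1.0 / 64
u0 = [math.sin(2 * math.pi * i * dx) for i in range(n)]
u = solve_fv(u0, flux=lambda q: a * q, max_speed=a, dx=dx, dt=0.5 * dx, steps=10)
```

    Swapping in a different flux function (e.g. Burgers' flux `0.5*q*q`) changes the PDE being solved without touching the solver loop, which is the division of labour the engine concept relies on.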

    The EU Center of Excellence for Exascale in Solid Earth (ChEESE): Implementation, results, and roadmap for the second phase


    Towards a deeper understanding of hybrid programming

    With the end of Dennard scaling, future high-performance computers are expected to consist of distributed nodes whose cores have direct access to shared memory within a node. However, many parallel applications still use a pure message-passing programming model based on the Message Passing Interface (MPI), and thereby potentially fail to make optimal use of shared-memory resources. The pure message-passing approach—as argued in this work—is not necessarily the best fit for current and future supercomputing architectures. In this thesis, I therefore present a detailed performance analysis of so-called hybrid programming models, which aim to improve performance by combining a shared-memory model with the message-passing model on current symmetric multiprocessor (SMP) systems. First, inter-node communication performance is investigated in the context of (hybrid) message-passing programs. A novel performance model for estimating communication performance on current SMP nodes is presented. In contrast to the commonly used classic postal performance model, the new model predicts inter-node communication performance more accurately in the presence of simultaneously communicating processes and saturation of the network interface controller on current multicore architectures. The implications of the new model for hybrid programs are discussed. In addition, I demonstrate the (current) difficulties of multithreaded MPI communication based on results obtained for a multithreaded ping-pong benchmark. Moreover, I show how intra-node MPI communication performance can be improved significantly for small- to medium-sized messages by saving message-passing overhead and/or through superior cache usage. This is achieved by a direct copy in shared memory using either the hybrid MPI+MPI or the MPI+OpenMP programming method. 
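    The saturation effect motivating the new model can be sketched in a few lines. The thesis' actual model is not reproduced in the abstract, so the second function below is purely illustrative, with invented names: it shows why the classic postal model, which ignores how many processes share the network interface, underestimates transfer time once the NIC saturates.

```python
def postal_time(n, alpha, beta):
    """Classic postal (Hockney) model: latency alpha plus per-byte
    cost beta, independent of how many processes communicate."""
    return alpha + n * beta

def saturated_time(n, alpha, beta, procs, beta_nic):
    """Illustrative sketch only (NOT the thesis' actual model): with
    `procs` processes communicating simultaneously, the shared NIC's
    aggregate bandwidth caps the per-process rate, so the effective
    per-byte cost grows once procs * beta_nic exceeds beta."""
    return alpha + n * max(beta, procs * beta_nic)
```

    With one communicating process the two models agree; with many, the saturated estimate grows linearly in the process count while the postal estimate stays flat, which is the mismatch the thesis' measurements expose.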
Furthermore, I contrast and evaluate in depth several (pure and hybrid) implementation options for a structured-grid sparse matrix-vector multiplication. These choices differ in how hybrid parallelism is exploited at the application level (coarse-grained vs. fine-grained problem decomposition) and in the hybrid programming systems used (pure MPI vs. MPI+MPI vs. MPI+OpenMP). I discuss their performance factors, such as locality, overhead, efficient use of MPI's derived datatypes, and the serial fraction in Amdahl's law. Moreover, I experimentally demonstrate how a coarse-grained hybrid application design can be used to control these factors, resulting in significant performance improvements (compared to a pure MPI parallelization) in communication and/or synchronization for both the hybrid MPI+MPI and MPI+OpenMP parallel programming approaches for different grid decompositions.
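    The kernel under study can be made concrete with a serial sketch: a structured-grid sparse matrix-vector product is a stencil application, here the 2-D 5-point Laplacian applied matrix-free. The function name is illustrative. In the hybrid variants compared above, the outer loop would be split across MPI ranks (coarse-grained) or OpenMP threads (fine-grained); here it is serial.

```python
def stencil_matvec(u, nx, ny):
    """Structured-grid sparse matrix-vector product sketch: apply the
    2-D 5-point Laplacian (Dirichlet-style boundary: missing
    neighbours contribute nothing) without storing the matrix."""
    v = [0.0] * (nx * ny)
    for j in range(ny):
        for i in range(nx):
            k = j * nx + i
            s = 4.0 * u[k]               # diagonal entry
            if i > 0:      s -= u[k - 1]   # west neighbour
            if i < nx - 1: s -= u[k + 1]   # east neighbour
            if j > 0:      s -= u[k - nx]  # south neighbour
            if j < ny - 1: s -= u[k + nx]  # north neighbour
            v[k] = s
    return v
```

    The coarse-grained vs. fine-grained distinction then amounts to whether each parallel worker owns a contiguous subgrid (with halo exchange at the borders) or merely a slice of the j-loop iterations.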

    Doubt and Redundancy Kill Soft Errors—Towards Detection and Correction of Silent Data Corruption in Task-based Numerical Software

    Resilient algorithms in high-performance computing are subject to rigorous non-functional constraints: resiliency must not increase the runtime, memory footprint or I/O demands too significantly. We propose a task-based soft-error detection scheme that relies on error criteria per task outcome. These criteria formalise how “dubious” an outcome is, i.e. how likely it is to contain an error. The whole simulation is replicated once, forming two teams of MPI ranks that share their task results; ideally, each team thus handles only around half of the workload. If a task yields large error-criterion values, i.e. is dubious, we compute the task redundantly and compare the outcomes. Whenever they disagree, the task result with the lower error likelihood is accepted. We obtain a self-healing, resilient algorithm that can compensate for silent floating-point errors without a significant performance, I/O or memory-footprint penalty. Case studies, however, suggest that careful, domain-specific tailoring of the error criteria remains essential.
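    The doubt-and-redundancy decision can be sketched per task. All names below are illustrative, not the paper's implementation: a plausible outcome is accepted immediately, a dubious one triggers redundant computation, and on disagreement the outcome with the lower error-criterion value wins.

```python
def accept_outcome(task, compute, error_criterion, threshold):
    """Sketch (illustrative names) of per-task soft-error handling:
    recompute only when the error criterion flags the outcome as
    dubious, and on disagreement keep the less error-prone result."""
    first = compute(task)
    if error_criterion(first) <= threshold:
        return first                 # plausible: accept without redundancy
    second = compute(task)           # dubious: compute redundantly
    if first == second:
        return first                 # redundant runs agree
    # Disagreement: accept the result with the lower error likelihood.
    return min(first, second, key=error_criterion)
```

    The scheme's cost model follows directly: only tasks whose outcomes exceed the threshold pay for a second execution, which is why the tailoring of the criteria (and threshold) is decisive for both overhead and detection rate.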

    Lightweight Task Offloading Exploiting MPI Wait Times for Parallel Adaptive Mesh Refinement

    Balancing the workload of sophisticated simulations is inherently difficult, since we have to balance both computational workload and memory footprint over meshes that can change at any time or yield unpredictable cost per mesh entity, while modern supercomputers and their interconnects start to exhibit fluctuating performance. We propose a novel lightweight balancing technique for MPI+X to accompany traditional, prediction-based load balancing. It is a reactive diffusion approach that uses online measurements of MPI idle time to migrate tasks temporarily from overloaded to underemployed ranks. Tasks are deployed to ranks that would otherwise wait, processed with high priority, and made available to the overloaded ranks again; this migration is nonpersistent. Our approach hijacks idle time to do meaningful work and is fully nonblocking, asynchronous and distributed, without a global data view. Tests with a seismic simulation code developed in the ExaHyPE engine demonstrate the method's potential: we found speed-ups of up to 2-3 for ill-balanced scenarios without logical modifications of the code base, and show that the strategy can react quickly to temporarily changing workload or node performance.
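    One diffusion round of the idea above can be sketched as follows. The function and its ring topology are illustrative, not the paper's implementation: ranks that measured MPI idle time pull a small chunk of tasks from their more loaded neighbour, using only local information, and the record of moves allows the results to flow back to the owning rank.

```python
def diffuse(load, idle, chunk=1):
    """One round of a reactive-diffusion sketch (illustrative, not the
    paper's code): each rank with measured idle time temporarily takes
    `chunk` tasks from its more loaded ring neighbour. `load` is
    modified in place; the returned moves record (victim, helper,
    ntasks) so ownership stays with the victim (nonpersistent)."""
    n = len(load)
    moved = []
    for r in range(n):
        if idle[r] <= 0:
            continue                     # rank r saw no MPI wait time
        # Pick the more loaded of the two ring neighbours, locally.
        victim = max((r - 1) % n, (r + 1) % n, key=lambda v: load[v])
        if load[victim] > load[r] + chunk:
            load[victim] -= chunk        # victim sheds work...
            load[r] += chunk             # ...helper processes it at high priority
            moved.append((victim, r, chunk))
    return moved
```

    Because each rank only inspects its neighbours and its own idle measurement, the scheme needs no global data view and no blocking collective, matching the nonblocking, distributed character described above.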
